16 research outputs found

    Evaluation of modelling approaches for predicting the spatial distribution of soil organic carbon stocks at the national scale

    Soil organic carbon (SOC) plays a major role in the global carbon budget. It can act as a source or a sink of atmospheric carbon, thereby possibly influencing the course of climate change. Improving the tools that model the spatial distributions of SOC stocks at national scales is a priority, both for monitoring changes in SOC and as an input for global carbon cycle studies. In this paper, we compare and evaluate two recent and promising modelling approaches. First, we considered several increasingly complex boosted regression trees (BRT), a convenient and efficient multiple regression model from the statistical learning field. Further, we considered a robust geostatistical approach coupled to the BRT models. The different approaches were tested on the dataset from the French Soil Monitoring Network, with a consistent cross-validation procedure. We showed that when a limited number of predictors were included in the BRT model, the standalone BRT predictions were significantly improved by robust geostatistical modelling of the residuals. However, when data for several SOC drivers were included, the standalone BRT model predictions were not significantly improved by geostatistical modelling. Therefore, in this latter situation, the BRT predictions might be considered adequate without the need for geostatistical modelling, provided that i) care is exercised in model fitting and validating, and ii) the dataset does not allow for modelling of local spatial autocorrelations, as is the case for many national systematic sampling schemes.

    Spatial distribution of soil organic carbon stocks in France

    Soil organic carbon plays a major role in the global carbon budget, and can act as a source or a sink of atmospheric carbon, thereby possibly influencing the course of climate change. Changes in soil organic carbon (SOC) stocks are now taken into account in international negotiations regarding climate change. Consequently, developing sampling schemes and models for estimating the spatial distribution of SOC stocks is a priority. The French soil monitoring network has been established on a 16 km × 16 km grid and the first sampling campaign has recently been completed, providing around 2200 measurements of stocks of soil organic carbon, obtained through an in situ composite sampling, uniformly distributed over the French territory. We calibrated a boosted regression tree model on the observed stocks, modelling SOC stocks as a function of other variables such as climatic parameters, vegetation net primary productivity, soil properties and land use. The calibrated model was evaluated through cross-validation and eventually used for estimating SOC stocks for mainland France. Two other models were calibrated on forest and agricultural soils separately, in order to assess more precisely the influence of pedo-climatic variables on SOC for such soils. The boosted regression tree model showed good predictive ability, and enabled quantification of relationships between SOC stocks and pedo-climatic variables (plus their interactions) over the French territory. These relationships strongly depended on the land use, and more specifically, differed between forest soils and cultivated soil. The total estimate of SOC stocks in France was 3.260 ± 0.872 PgC for the first 30 cm. It was compared to another estimate, based on the previously published European soil organic carbon and bulk density maps, of 5.303 PgC. We demonstrate that the present estimate might better represent the actual SOC stock distributions of France, and consequently that the previously published approach at the European level greatly overestimates SOC stocks.

    Large trends in French topsoil characteristics are revealed by spatially constrained multivariate analysis

    Spatially constrained multivariate analysis methods (MULTISPATI-PCA) and classical principal component analysis are applied for the entire country of France to study the main soil characteristics of topsoil and to assess if their multivariate spatial pattern can provide insight on their extent and origin. The results of the MULTISPATI-PCA provided evidence of strong spatial structures attributed to different natural processes. The first axis was interpreted as an axis of global soil richness in clay content. Axis 2 reflected the influence of some parent materials on the geochemical content of K and Al. Axis 3 showed a very large gradient of relative content in coarse silt. Axis 4 was driven by gradients of maritime influence. We show that MULTISPATI-PCA detects and maps large regional trends in the distribution of topsoil characteristics better than classical PCA. The first two axes were expected and the maps obtained by both methods were consistent. Interestingly, the other gradients were not expected and were better shown by MULTISPATI-PCA than by classical PCA.

    Estimating Forest Soil Bulk Density Using Boosted Regression Modelling

    Soil bulk density (ρ) is an important physical property, but its measurement is frequently lacking in soil surveys due to the time-consuming nature of making the measurement. As a result, pedotransfer functions (PTFs) have been developed to predict ρ from other more easily available soil properties. These functions are generally derived from regression methods that aim to fit a single model. In this study we use a technique called GBM (Generalized Boosted Regression Modelling; Ridgeway, 2006) which combines two algorithms: regression trees and boosting. We built two models and compared their predictive performance with published PTFs. All the functions were fitted based on the French forest soil dataset for the European demonstration Biosoil project. The two GBM models were Model G3, which involved the three most frequent quantitative predictors used to estimate soil bulk density (organic carbon, clay and silt), and Model G10, which included ten qualitative and quantitative input variables such as parent material or tree species. Based on the full dataset, Models G3 and G10 gave R² values of 0.45 and 0.86, respectively. Model G3 did not significantly outperform the best published model. Even when fitted from an external dataset, it explained only 29% of the variation of ρ with a root mean square error of 0.244 g cm⁻³. In contrast, the more complex Model G10 outperformed the other models during external validation, with an R² of 0.67 and a predictive deviation of ± 0.168 g cm⁻³. The variation of forest soil bulk densities was mainly explained by five input variables: organic carbon content, tree species, the coarse fragment content, parent material and sampling depth.
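    The abstract above reports predictive performance as R² and root mean square error. As a minimal sketch of how such figures can be computed from paired observations and predictions, assuming the common definition R² = 1 - SS_res / SS_tot (the study may define it differently, for example as a squared correlation), the following uses made-up bulk density values that are purely illustrative and not taken from the study:

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical bulk densities in g cm^-3 (illustrative only, not study data).
observed = [0.9, 1.1, 1.3, 1.5]
predicted = [1.0, 1.0, 1.4, 1.4]

print(f"RMSE = {rmse(observed, predicted):.3f} g cm^-3")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

    For an external validation of the kind described, the observed values would come from a dataset that was not used to fit the model.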

    Spatial distribution of soil organic carbon stocks in France : Discussion paper

    © Author(s) 2010. This work is distributed under the Creative Commons Attribution 3.0 License. Peer reviewed.

    Statistical sampling design impact on predictive quality of harmonization functions between soil monitoring networks

    Regulations about soil quality are normally imposed at international level while many countries have set up monitoring networks at national scale. Since these networks use different sampling strategies, there is a strong need to harmonize the collected data from the national networks a posteriori in order to answer questions raised by the global regulations. For that purpose, calibration sites where different sampling strategies are carried out are necessary in order to construct harmonization functions between measurements from different sampling protocols. A case study is available for French forest soils that have been sampled twice simultaneously on the same sampling grid but with different sampling and analytical strategies: a first sampling for the French soil quality monitoring network (RMQS) and a second one for the European forest monitoring network (ICP Forests level I second survey, i.e. Biosoil). However, the way to define the number and the position of these calibration sites remains a key issue. In this work, we compare both RMQS and Biosoil strategies for a set of measured variables of interest (carbon, potassium and lead contents and pH) and aim to define the minimum number of sites and their best location to establish reliable harmonization functions. Three statistical methods for construction of sampling designs are tested: random sampling, conditioned Latin Hypercube Sampling (cLHS, Minasny and McBratney, 2006) and D-Latin Hypercube Sampling (DLHS, Minasny and McBratney, 2010). With each method, we investigate the effects of the number of calibration data on the predictive quality of the harmonization functions. First, we show that both cLHS and DLHS are better than simple random sampling. Then, the difference between cLHS and DLHS performance depends mainly on the size of the samples, the nature of the soil property and the form of the pedotransfer functions. © 2013 Elsevier B.V. All rights reserved.

    Mapping soil Pb stocks and availability in mainland France combining regression trees with robust geostatistics

    Maps of lead (Pb) stocks in soils and estimates of its availability are needed to assess risks of contamination. Stocks in soils of total and ethylenediamine tetraacetic acid (EDTA) extractable Pb, as well as Pb availability, assessed by EDTA/total Pb ratio, were measured and calculated to a depth of 30 cm with the French soil monitoring network at sites defined by a regular 16 × 16 km grid. Setting aside point anomalies by winsorizing, these properties were mapped using linear mixed models (LMM). LMMs combined conditional partitioning trees upon 5 predictors (pH, texture, parent material, land use, population density) with robust geostatistics to avoid distortion due to outlying values. Rather than selecting the fixed effects according to expert knowledge, regression trees were used to account for explanatory variables in a single classification. This original method stressed both the necessity for a geostatistical component to complement regression tree models when spatial correlation is evident, and the usefulness of these trees to interpret maps. Pb stocks varied widely with peak concentrations and availability in densely populated areas. Lithology, texture and forestation also affected total Pb stocks. With regard to availability, forestation and pH appeared as key factors. © 2011 Elsevier B.V.
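    The abstract above mentions winsorizing to set aside anomalous values. As a rough sketch of the general idea only, the function below clamps values beyond chosen empirical quantiles to those quantiles. The quantile levels, the simple rank-based quantile rule, and the sample values are all assumptions made for illustration; the abstract does not state which thresholds or procedure the authors used.

```python
def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clamp values outside the given empirical quantiles to those quantiles.

    Quantiles are taken with a simple rank rule (an assumption of this sketch):
    the element at index floor(q * (n - 1)) of the sorted data.
    """
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(lower_q * (n - 1))]
    high = ordered[int(upper_q * (n - 1))]
    return [min(max(v, low), high) for v in values]


# Hypothetical stock values with one extreme outlier (illustrative only).
stocks = [10, 12, 11, 13, 12, 14, 11, 12, 13, 250]

# Clamp only the upper tail: the outlier is pulled down to the upper threshold.
print(winsorize(stocks, lower_q=0.0, upper_q=0.9))
```

    Unlike trimming, winsorizing keeps every site in the dataset, so the sampling grid stays complete while the influence of extreme values is reduced.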